A Mixed Model for Cross Lingual Opinion Analysis

Authors

  • Lin Gui
  • Ruifeng Xu
  • Jun Xu
  • Li Yuan
  • Yuanlin Yao
  • Jiyun Zhou
  • Qiaoyun Qiu
  • Shuwei Wang
  • Kam-Fai Wong
  • Ricky Cheung
Abstract

The performance of machine-learning-based opinion analysis systems is often limited by insufficient opinion training corpora. This problem is more serious for resource-poor languages. Cross-lingual opinion analysis (CLOA), which leverages opinion resources in one (source) language to improve opinion analysis in another (target) language, has therefore attracted growing research interest. Currently, transfer-learning-based CLOA approaches sometimes overfit to the resources of a single language, while co-training-based CLOA approaches achieve only limited improvement in the bilingual decision. To address these problems, we propose a mixed CLOA model that estimates the confidence of each monolingual opinion analysis system from its training errors under bilingual transfer self-training and co-training, respectively. The opinion polarity of each test sample is then classified using the weighted average distance between the sample and the classification hyperplanes as the confidence. Evaluation on the NLP&CC 2013 CLOA bakeoff dataset shows that this approach achieves the best performance, outperforming both transfer-learning-based and co-training-based approaches.
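
The decision rule described in the abstract, a confidence-weighted average of each monolingual classifier's distance to its hyperplane, might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the weighting formula (one minus training error, normalized to sum to one) is an assumption, since the abstract does not give the exact formula.

```python
from typing import List, Sequence, Tuple


def signed_distance(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = sum(w * w for w in weights) ** 0.5
    if norm == 0.0:
        raise ValueError("hyperplane normal vector must be non-zero")
    return (sum(w * xi for w, xi in zip(weights, x)) + bias) / norm


def confidence_weights(train_errors: Sequence[float]) -> List[float]:
    """ASSUMPTION: a classifier's confidence is (1 - training error), normalized."""
    raw = [1.0 - e for e in train_errors]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("at least one classifier needs training error below 1")
    return [r / total for r in raw]


def mixed_polarity(distances: Sequence[float],
                   train_errors: Sequence[float]) -> Tuple[int, float]:
    """Combine per-language signed distances into one polarity decision.

    distances[i] is the signed distance of the test sample from the i-th
    monolingual classifier's hyperplane (positive means positive opinion).
    Returns (label, score), where label is +1 or -1.
    """
    if len(distances) != len(train_errors):
        raise ValueError("need one training error per classifier")
    weights = confidence_weights(train_errors)
    score = sum(w * d for w, d in zip(weights, distances))
    return (1 if score >= 0.0 else -1), score


# Hypothetical example: the source-language classifier (training error 0.1)
# votes positive, the target-language classifier (error 0.3) votes negative.
label, score = mixed_polarity(distances=[0.8, -0.5], train_errors=[0.1, 0.3])
```

In this sketch the classifier with the lower training error dominates when the two monolingual systems disagree, which is the intuition behind using training error as a confidence estimate.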

Related resources

Instance Level Transfer Learning for Cross Lingual Opinion Analysis

This paper presents two instance-level transfer learning based algorithms for cross lingual opinion analysis by transferring useful translated opinion examples from other languages as the supplementary training data for improving the opinion classifier in target language. Starting from the union of small training data in target language and large translated examples in other languages, the Tran...

Aligning Opinions: Cross-Lingual Opinion Mining with Dependencies

We propose a cross-lingual framework for fine-grained opinion mining using bitext projection. The only requirements are a running system in a source language and word-aligned parallel data. Our method projects opinion frames from the source to the target language, and then trains a system on the target language using the automatic annotations. Key to our approach is a novel dependency-based mod...

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining

We present the Trip-MAML dataset, a Multi-Lingual dataset of hotel reviews that have been manually annotated at the sentence-level with Multi-Aspect sentiment labels. This dataset has been built as an extension of an existent English-only dataset, adding documents written in Italian and Spanish. We detail the dataset construction process, covering the data gathering, selection, and annotation. ...

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

Publication date: 2013